Automatic Metadata Extraction and Metadata Controlled Production Process

ABSTRACT

Metadata generated at the outset of production of an audio visual program, such as a television news program, undergoes transmission to a field device associated with a capture device operated by production personnel, such as a news reporter and/or a videographer, to capture at least one of audio and video information. The production personnel will typically edit the metadata for incorporation into the file structure of the audio and/or video information captured by the capture device. A server 42 receives and updates the original metadata using the metadata in the file structure of the captured audio and/or video information, thus effectively harvesting the original metadata.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority under 35 U.S.C. 119(e) to U.S. Provisional Patent Application Ser. No. 60/713,848, filed Sep. 2, 2005, the teachings of which are incorporated herein.

TECHNICAL FIELD

This invention relates to a technique for capturing and processing metadata during production of an audio visual program such as a television news story.

BACKGROUND ART

Advances in the development of television production equipment allow recording devices like video recorders, servers and camcorders, for example, to record not only audio and/or video information, but metadata as well. Such metadata comprises data about the captured audio and/or video information. Such metadata can include simple information, such as the time and date of the capture of the associated audio and/or video information. More complex metadata can include identification of the content of the audio and/or video information, as well as data associated with authoring and editing of that content.

Techniques for capturing metadata in conjunction with the capture of audio and/or video information, as well as techniques for associating such metadata have become well known. What has proven elusive is the ability to make good use of the metadata. In other words, the problem facing production personnel is the generation of “useful” metadata that can assist production personnel, rather than the creation of metadata that simply gets stored without aiding in the production process.

Thus, a need exists for a technique for generating and associating useful metadata in conjunction with the production of audio-visual content, and particularly, event-driven audio-visual content, such as television news material.

BRIEF SUMMARY OF THE INVENTION

Briefly, in accordance with an illustrative embodiment of the present principles, there is provided a technique for associating metadata with at least one of audio and video information. The method commences by transmitting metadata to a field device associated with the capture of the at least one of the audio and video information so that the metadata can be used by an operator of that device. The metadata is also transmitted to a storage mechanism, such as a server or the like, destined to receive the at least one of the audio and video information along with edited metadata, such as edited metadata received from the field device. The metadata received at the server undergoes updating in accordance with the edited metadata. In this way, metadata created at the outset of production can undergo editing by field personnel in association with the capture of the audio and/or video information.

In practice, the edited metadata from the field can serve to update the original metadata at a server that receives the audio and/or video information captured in the field by a capture device, such as a camcorder or the like. Thus, the updated metadata stored at the server will provide information useful for association with essence objects captured by the capture device to enable secure identification of such objects.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a block schematic diagram of a system for audio visual production illustrating metadata flow in accordance with the present principles.

DETAILED DESCRIPTION

FIG. 1 depicts a block schematic diagram of a system 10, in accordance with the present principles, for capturing useful metadata in connection with the capture of audio and/or video information. In the illustrated embodiment, the system 10 takes the form of a system for the production of television news, although the metadata capture technique of the present principles has application to other systems. The solid lines within FIG. 1 indicate metadata flow whereas the dashed blocks indicate the type of metadata added and/or transmitted.

The system 10 of FIG. 1 comprises a programmed processor 12, typically in the form of a news room computer system (NRCS) capable of generating assignment information for use by personnel who gather news stories. News room computer systems capable of performing this function are available from manufacturers of automated news systems. The assignment information generated by the processor 12 can include metadata associated with a news story to be pursued by news gathering personnel. Such metadata can originate from a variety of sources. For example, the processor 12 can receive metadata from one or more news wire services 14 (only one of which is shown), such as Reuters, Associated Press or United Press International. Metadata from the news wire service 14 can include names, story location(s), and any other information, as indicated by the legend in block 16, for attachment to a story assignment generated by the processor 12. In practice, the processor 12 can also receive metadata from an assignment editor 18, an individual who creates and edits assignment information for the news gathering personnel. The assignment editor 18 will typically add metadata from a file, sometimes referred to as a “tickler” file, that can include the names of individuals for interview and story locations, as indicated by the legend in block 20. In addition, the processor 12 can receive status metadata, as indicated by the legend in block 21.

In the process of generating assignment information, the processor 12 provides assignment information in the form of an assignment grid 22 that designates which news gathering personnel handle which stories. The processor 12 communicates the assignment information and metadata to a server 24, which also stores status information. The server 24 can also store metadata as received through an interface 25, designated by the legend “mailer”. Such status information can include slug data (i.e., data related to slugs, which constitute blank space reserved for future stories), story locations, names, set-up times and show times, as indicated by the legend in block 26.

A communications network 28, typically a wireless communications network, transmits metadata from the server 24 to a field device 30, such as a laptop or personal computer, associated with a reporter 32 and videographer (cameraman) 34 assigned to a particular news story. The server 24 makes use of the assignment data contained in the assignment grid 22 to transmit appropriate metadata to the corresponding field device 30 so that the reporter 32 and videographer 34 receive information related to the stories to which they have been assigned. In other words, the server 24 provides the reporter 32 and videographer 34 with instructions on where to go and whom to interview, background wire stories from the wire service 14, notes from previous coverage and other information relevant to the assigned story or stories. Additionally, the metadata from the server 24 includes the identity of the specific story assigned to a particular news team (e.g., the reporter 32 and videographer 34). This identity is usually referred to by the term Media Object Server Identification or MOS-ID. Using the field device 30, the reporter 32 can add and/or edit the metadata transmitted from the server 24. For example, the reporter 32 can add information related to the identity (e.g., the name) of individuals interviewed as well as notes created by the reporter, as indicated by the legend in block 36. The videographer 34 can add metadata indicative of preferred settings and/or other information related to the captured image, as indicated by the legend in block 38. Although FIG. 1 depicts both the reporter 32 and videographer 34, a single individual could easily serve both functions.
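By way of illustration only, the selection of metadata for a given field device using the assignment grid 22 might be sketched as follows. The record fields, the device identifiers and the representation of the grid as a mapping from MOS-ID to assigned devices are assumptions made for this sketch and do not appear in FIG. 1.

```python
from dataclasses import dataclass, field


@dataclass
class StoryMetadata:
    """Hypothetical record for the metadata attached to one story assignment."""
    mos_id: str                                   # story identity (the MOS-ID)
    slug: str
    locations: list = field(default_factory=list)
    interviewees: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def metadata_for_device(device_id, assignment_grid, stories):
    """Return the metadata records for the stories assigned to one field device.

    assignment_grid maps a MOS-ID to the set of field-device IDs of the team
    assigned to that story (an assumed representation of the grid).
    """
    return [
        stories[mos_id]
        for mos_id, devices in assignment_grid.items()
        if device_id in devices and mos_id in stories
    ]


stories = {
    "story-001": StoryMetadata("story-001", "city-hall-vote",
                               locations=["City Hall"],
                               interviewees=["John Doe"]),
    "story-002": StoryMetadata("story-002", "harbor-fire",
                               locations=["Pier 4"]),
}
assignment_grid = {
    "story-001": {"laptop-A"},
    "story-002": {"laptop-B"},
}

# Only the story assigned to this device is selected for transmission.
for record in metadata_for_device("laptop-A", assignment_grid, stories):
    print(record.mos_id, record.slug, record.interviewees)
```

In this sketch each record carries its MOS-ID with it, so that any edits made later in the field remain traceable to the original assignment.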

The field device 30 communicates the metadata it has received to a capture device 40 that captures audio and/or video information. In practice, the capture device 40 takes the form of a camcorder, such as the Infinity series camcorder available from Thomson Grass Valley, although the capture device could comprise the combination of a video camera and a recording device, such as a videotape recorder, a video disc recorder or a server. The metadata received by the capture device 40 from the field device 30 can also include one or more of the following: Global Positioning Satellite information, compass bearings, lens settings, aspect ratio data, as well as any other data generated by the field device 30 and/or entered manually by one or more of the reporter 32 and videographer 34. Such metadata is indicated by the legend in block 41. The metadata can also include information entered directly into the capture device 40 by the videographer 34 through actuation of one or more keys (not shown) on the capture device. Note that while the field device 30 and the capture device 40 are depicted as separate elements, a single unit could serve the functions of both the capture device and the field device.

The metadata sent from the field device 30 to the capture device 40, along with metadata entered into the capture device by the videographer 34, gets entered into the file structure of the captured audio and/or video information. Typically the file structure comprises the Material Exchange Format (MXF) structure, but other file structures could be used. Entering such metadata into the file structure of the captured audio and/or video information in the manner described resolves the long-standing metadata paradox, namely, how to create metadata at the outset of creating an audio-visual program, such as a news program. As discussed, the metadata incorporated in the file structure of the audio and/or video information captured by the capture device 40 includes the metadata previously created by the assignment editor 18 and the wire service 14. Thus, the field device 30 simply “harvests” metadata already existing on the server 24.
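By way of illustration only, the notion of carrying metadata inside the same file as the captured essence might be sketched with a toy container. The length-prefixed JSON header used here is an assumption made purely for this sketch; it is not the MXF structure, and the field names are hypothetical.

```python
import json
import struct


def wrap_essence(essence: bytes, metadata: dict) -> bytes:
    """Prefix essence bytes with a length-prefixed JSON metadata header.

    This toy layout is NOT MXF; it only illustrates metadata travelling
    inside the same file as the captured audio and/or video information.
    """
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    # 4-byte big-endian header length, then the header, then the essence.
    return struct.pack(">I", len(header)) + header + essence


def unwrap_essence(blob: bytes):
    """Recover (metadata, essence) from a blob built by wrap_essence."""
    (header_len,) = struct.unpack(">I", blob[:4])
    metadata = json.loads(blob[4:4 + header_len].decode("utf-8"))
    return metadata, blob[4 + header_len:]


# Metadata harvested from the server, plus additions made in the field.
from_server = {"mos_id": "story-001", "interviewees": ["John Doe"]}
from_field = {"reporter_notes": "Vote delayed", "lens": "wide"}

clip = wrap_essence(b"\x00\x01\x02", {**from_server, **from_field})
metadata, essence = unwrap_essence(clip)
print(metadata["mos_id"], len(essence))
```

Because the header travels with the essence, a downstream device that receives only the clip file can still recover the MOS-ID and the field edits.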

The audio and/or video information captured by the capture device 40, and the metadata entered into the file structure of such information, undergo downloading from the capture device 40 to a storage mechanism 42. In the illustrative embodiment of FIG. 1, the storage mechanism 42 bears the designation “Ingest Station (IS)” because it serves to ingest (receive) the audio and/or video information and associated metadata from the capture device 40. In addition to receiving metadata from the field device 30 via the audio and/or video information downloaded from the capture device 40, the storage mechanism 42 also receives the same metadata originally sent to the field device 30 by the server 24, including the MOS-ID identifying the story. Additional metadata, such as metadata related to stories, slugs, schedules and estimated time of arrival (ETA), as indicated by the legend in block 44, can be added to the metadata from the processor 12. Further, the storage mechanism 42 can also receive metadata from an external source (not shown) related to show schedules, as indicated by the legend in block 45.

The storage mechanism 42 has the ability to match the metadata received from the capture device 40 with the metadata received from the server 24 by matching the MOS-ID of the metadata received from the various sources. In particular, the storage mechanism 42 will look for the MOS-ID comprising part of the metadata in the file structure of the audio and/or video information downloaded from the capture device 40 to match it with the MOS-ID in the metadata received from the server 24. In this way, the storage mechanism 42 can know what to do with such metadata. More importantly, the storage mechanism 42 can update the status information associated with audio and/or video information (e.g., news stories) created via the system 10 based on the updating of the metadata created by the processor 12 with edited metadata from the capture device 40.
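By way of illustration only, the matching and updating performed at the storage mechanism 42 might be sketched as follows. The dictionary layout, the "status" values and the rule that field edits take precedence over original values are assumptions made for this sketch.

```python
def update_from_ingest(server_records, ingested_clips):
    """Merge field-edited metadata into the original records by MOS-ID.

    server_records: dict mapping MOS-ID -> original metadata dict.
    ingested_clips: iterable of metadata dicts read from clip file structures.
    Returns the list of clip metadata whose MOS-ID matched no record.
    """
    unmatched = []
    for clip_meta in ingested_clips:
        record = server_records.get(clip_meta.get("mos_id"))
        if record is None:
            # No original assignment carries this MOS-ID; set it aside.
            unmatched.append(clip_meta)
            continue
        # Assumption: values edited in the field replace the original values.
        record.update(clip_meta)
        record["status"] = "ingested"
    return unmatched


server_records = {
    "story-001": {"mos_id": "story-001", "interviewees": ["John Doe"],
                  "status": "assigned"},
    "story-002": {"mos_id": "story-002", "interviewees": [],
                  "status": "assigned"},
}
ingested = [
    {"mos_id": "story-001", "interviewees": ["John Doe", "Jane Roe"]},
    {"mos_id": "story-999", "interviewees": []},
]

unmatched = update_from_ingest(server_records, ingested)
print(server_records["story-001"]["status"], len(unmatched))
```

Keying the merge on the MOS-ID means the update requires no knowledge of which reporter, camera or file name produced the clip.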

Many benefits result from using the metadata created, edited and/or entered by one or both of the reporter 32 and videographer 34 to update the original metadata stored in the server 24. For example, an editor using a news edit block 48, comprised of editing software running on a processor with an associated terminal, can edit audio and/or video information downloaded to the storage mechanism 42 using the metadata to determine not only the identity of the reporter and videographer assigned to the story, but also the status of the story, including, but not limited to, the estimated time of arrival (ETA) for material not yet downloaded.

Extraction and use of the metadata can occur elsewhere in the production process. For example, consider the circumstance when a reporter 32 and videographer 34 have the task of interviewing a certain individual, say John Doe. During the interview process, the reporter 32 and/or videographer 34 can check John Doe's name against the list of names in the assignment information comprising part of the metadata transmitted to the field device 30. If the name matches, then the reporter 32 or videographer 34 can add the name to one or more of the video frames captured by the capture device 40. The non-linear editor (NLE) program running on the news edit block 48 or elsewhere in the system 10 of FIG. 1 can enter the name into a character generator (CG) template with one or more commands, thus avoiding spelling or transcription errors. This will also reduce the likelihood of misidentification of an interview subject.
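By way of illustration only, the name check and the population of a character generator template might be sketched as follows. The case-insensitive comparison, the function names and the template syntax are assumptions made for this sketch; no particular NLE or CG interface is implied.

```python
def confirm_name(candidate, assigned_names):
    """Return the spelling held in the assignment metadata, or None.

    Assumption: names match if equal after trimming and case folding.
    """
    wanted = candidate.strip().casefold()
    for name in assigned_names:
        if name.strip().casefold() == wanted:
            return name
    return None


def fill_cg_template(template, name, title=""):
    """Insert a confirmed name (and optional title) into a CG text template."""
    return template.format(name=name, title=title).strip()


assigned_names = ["John Doe", "Jane Roe"]

confirmed = confirm_name("john doe ", assigned_names)
if confirmed is not None:
    # The spelling comes from the assignment metadata, not from retyping.
    print(fill_cg_template("{name} | {title}", confirmed, "Council Member"))
```

Because the text placed in the template is copied from the assignment metadata rather than retyped, a name that fails the check is never inserted automatically.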

The server 42 serves as the source of metadata for the news edit block 48. Additionally, the server 42 also serves as the metadata source for a simple data base (SDB) (not shown) that stores a play list of stories edited via the news edit block 48. The server 42 can also provide metadata to a news room computer system, an example of which is the Electronic News Production System (ENPS) available from Associated Press. The news edit block 48 not only receives both audio and/or video information and associated metadata from the server 42 but also receives metadata associated with stories, slugs, running order, computer graphics, production techniques and the like, as indicated by the legend in block 51. Such metadata can originate from one of several sources, such as a news room computer system (not shown) and/or an integrated television production system (IPS) (not shown). The news edit block 48 can provide edited news story clips (e.g., edited audio and/or video information) to a playback unit 50. In addition, the news edit block 48 can supply edited news clips (e.g., edited audio and/or video information) and accompanying metadata to the IPS. The edited news clips (and accompanying metadata) provided by the server 42 to the IPS can include information, including metadata, from a character generator (not shown), as indicated by the legend in block 53. The news edit block 48 can also supply the edited news story clips to an asset management system (not shown) for other distribution. Such an asset management system can comprise the “NewsBrowse” system available from Thomson Grass Valley.

The playback unit 50 not only receives the news clips from the news edit block 48 but also receives audio and/or video information from the server 42 and from the IPS. The information from the IPS can also include metadata representing status information, as indicated by the legend in block 54. The playback unit 50 will feed edited news clips from the news edit block 48, as well as audio and/or video information from the server 42 and the IPS, to one or more of the news edit block 48, the news room computer system, the asset management system (e.g., NewsBrowse), an editor (not shown) and a switcher/production system (PS) 57. The clips provided by the playback unit 50 to the news edit block 48 can include metadata associated with the slugs and their running order, as indicated by the legend in block 56. The clips provided by the playback unit 50 to the news room computer system, the editor and/or the asset management system (NewsBrowse) can include status metadata, as indicated by the legend in block 58. Metadata, typically in the form of status information, as indicated by the legend in block 59, can accompany the audio and/or video information received by the switcher/production system 57 from the playback unit 50. Metadata, typically containing information related to stories, slugs, running order, on-air talent, graphics, special effects, character generator data, production techniques, camera data and the like, as indicated by the legend in block 60, can accompany assignment information and other data received by the switcher/production system 57 from the server 24. Metadata, typically in the form of character generator proxy and insertion information, as indicated by the legend in block 62, can accompany the audio and/or video information, and other metadata, received by the switcher/production system 57 from the news edit block 48.

The switcher/production system 57 can supply audio and/or video information (and accompanying metadata) to the playback unit 50 and to other system devices (not shown). The audio and/or video information supplied by the switcher/production system 57 to the playback unit 50 can include metadata containing status information, such as information associated with released audio and/or video information, as indicated by the legend in block 64. The audio and/or video information and accompanying metadata supplied from the switcher/production system 57 to the other system devices can include, for example, metadata related to Global Positioning Satellite data and lens information for the capture device 40 or other such devices (not shown), as well as graphics and character generator information and aspect ratio data for the switcher/production system, as indicated by the legend in block 66.

The foregoing describes a technique for capturing and processing metadata during production of an audio visual program.

1. A method for associating metadata with audio visual information, comprising the steps of: transmitting first metadata to a field device prior to capture of audio visual information; transmitting second metadata to a server destined to receive the audio visual information along with edited first metadata from the field device following capture of the audio visual information; and updating the second metadata received at the server in accordance with the edited first metadata.
 2. The method according to claim 1 wherein the first and second metadata are the same.
 3. The method according to claim 1 further comprising the step of operating the field device to edit the first metadata transmitted to the field device.
 4. The method according to claim 1 wherein the first transmitting step comprises the step of communicating the first metadata to the field device over a wireless communications network.
 5. The method according to claim 1 wherein the step of updating the second metadata received at the server further comprises the steps of: assigning an identification to the first metadata transmitted to the field device; assigning the same identification to the second metadata transmitted to the server; and matching the edited first metadata from the field device with the second metadata transmitted to the server using the identification.
 6. The method according to claim 1 wherein the first metadata includes information related to at least one of assignment data, location data, personnel data, story data, and individuals for interview data.
 7. The method according to claim 3 wherein the step of operating the field device to edit the first metadata transmitted thereto comprises the step of modifying the first metadata to add information gathered by a news reporter at a story location.
 8. The method according to claim 3 wherein the step of operating the field device to edit the first metadata comprises the step of modifying the first metadata to add information regarding capture of an image.
 9. The method according to claim 1 further including the steps of: incorporating edited first metadata from the field device into one of audio and/or video information from a capture device; and transmitting the one of the audio and/or video information incorporating the first edited metadata to the server.
 10. A method for associating metadata with audio visual information, comprising the steps of: transmitting metadata to a field device prior to capture of associated audio visual information, the metadata undergoing editing at the field device by an operator; transmitting the metadata to a server destined to receive the captured audio visual information along with edited metadata from the field device; and updating the metadata received at the server in accordance with the edited metadata from the field device.
 11. The method according to claim 10 wherein the first transmitting step comprises the step of communicating the metadata to the field device over a wireless communications network.
 12. The method according to claim 10 wherein the step of updating the metadata received at the server further comprises the steps of: assigning an identification to the metadata transmitted to the field device; assigning the same identification to the metadata transmitted to the server; and matching the edited metadata from the field device with the metadata transmitted to the server using the identification.
 13. The method according to claim 10 wherein the metadata transmitted to the field device includes information related to at least one of assignment data, location data, personnel data, story data, and individuals for interview data.
 14. The method according to claim 1 further including the steps of: incorporating edited metadata from the field device into one of audio and/or video information from a capture device; and transmitting the one of the audio and/or video information incorporating the first edited metadata to the server. 